feat: support for Spark 4 #589
base: main
Conversation
…late run up to stackabletech/operator-templating@e3ffe71 (#590) Reference-to: stackabletech/operator-templating@e3ffe71 (chore: SDP 25.7 templating updates)
* feat(helm): Add RBAC rule for automatic cluster domain detection
* chore: Bump stackable-operator to 0.94.0 and update other dependencies
* chore: Update changelog
* chore: Add sparkhistory and shs shortnames
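For orientation, the first commit above concerns a Helm-rendered RBAC rule. As a rough, hypothetical sketch only (which resources the operator actually reads for cluster domain detection is an assumption here, not taken from this PR), such a rule could look like:

```yaml
# Hypothetical ClusterRole fragment: read access to Node objects,
# one plausible source for deriving the cluster domain.
# The rule actually shipped by the chart may target different resources.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spark-k8s-operator  # placeholder name
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list"]
```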
Just minor text stuff. Tests to come.
A CHANGELOG entry is missing - not sure if it is needed as this is basically all test & doc changes.
=== Maven packages

The last and most flexible way to provision dependencies is to use the built-in `spark-submit` support for Maven package coordinates.
The downside of this method is that job dependencies are downloaded every time the job is submitted and this has several implications you must be aware of.
For example, the job submission time will be longer than with the other methods
Suggested change:
- For example, the job submission time will be longer than with the other methods
+ For example, the job submission time will be longer than with the other methods.
Network connectivity problems may lead to job submission failures.
And finally, not all type of dependencies can be provisioned this way. Most notably, JDBC drivers cannot be provisioned this way since the JVM will only look for them at startup time.
Suggested change:
- And finally, not all type of dependencies can be provisioned this way. Most notably, JDBC drivers cannot be provisioned this way since the JVM will only look for them at startup time.
+ And finally, not all type of dependencies can be provisioned this way.
+ Most notably, JDBC drivers cannot be provisioned this way since the JVM will only look for them at startup time.
If you need access to JDBC sources from your Spark application, consider building your own custom Spark image as shown above.
As mentioned above, not all dependencies can be provisioned this way.
JDBC drivers are notorious for not being supported by this method but other types of dependencies may also not work.
If a jar file can be provisioned using it's Maven coordinates or not, depends a lot on the way it is loaded by the JVM.
Suggested change:
- If a jar file can be provisioned using it's Maven coordinates or not, depends a lot on the way it is loaded by the JVM.
+ If a jar file can be provisioned using its Maven coordinates or not, depends a lot on the way it is loaded by the JVM.
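To make the mechanism discussed in this section concrete, here is a minimal sketch of a SparkApplication that provisions a dependency by its Maven coordinates, assuming the `spec.deps.packages` field described in the operator documentation; the job name, main class, application file and package coordinate are placeholders, not taken from this PR:

```yaml
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: example-maven-deps  # placeholder
spec:
  sparkImage:
    productVersion: 3.5.6
  mode: cluster
  mainApplicationFile: local:///stackable/spark/jobs/my-job.jar  # placeholder
  mainClass: com.example.MyJob  # placeholder
  deps:
    packages:
      # Resolved at submit time via spark-submit's --packages support,
      # so the caveats above (submission time, network failures) apply.
      - org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.6
```

Note that, per the limitation described above, a JDBC driver listed under `packages` would still not be picked up by the JVM; a custom image is the way to go for those.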
@@ -3,5 +3,6 @@
// Stackable Platform documentation.
// Please sort the versions in descending order (newest first)

- 4.0.0 (Hadoop 3.4.1, Scala 2.13, Python 3.11, Java 17) (Experimental)
- 3.5.5 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (Deprecated)
Suggested change:
- - 3.5.5 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (Deprecated)
+ - 3.5.6 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (LTS)
@@ -3,5 +3,6 @@
// Stackable Platform documentation.
// Please sort the versions in descending order (newest first)

- 4.0.0 (Hadoop 3.4.1, Scala 2.13, Python 3.11, Java 17) (Experimental)
- 3.5.5 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (Deprecated)
- 3.5.6 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (LTS)
Suggested change:
- - 3.5.6 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (LTS)
+ - 3.5.5 (Hadoop 3.3.4, Scala 2.12, Python 3.11, Java 17) (Deprecated)
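Given the version matrix above, the experimental 4.0.0 line has to be opted into explicitly. A hedged sketch of what that would look like in a cluster definition, assuming the usual `spec.sparkImage.productVersion` selection (resource name and application file are placeholders):

```yaml
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: spark4-smoke  # placeholder
spec:
  sparkImage:
    productVersion: 4.0.0  # Experimental per the version list above
  mode: cluster
  mainApplicationFile: local:///stackable/spark/jobs/my-job.jar  # placeholder
```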
🟢 Local tests (nightly suite) are all good. |
Description
Part of: #586
Depends on the corresponding image PR stackabletech/docker-images#1216
Spark 4 is considered experimental because of the following issues:
The integration tests have been updated to exclude Spark 4 for the tests known to cause problems, as sketched below.
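As an illustration of how such an exclusion can be expressed, here is a sketch loosely following the test-definition.yaml format used by Stackable integration test suites; the dimension and test names are invented, not taken from this PR:

```yaml
dimensions:
  - name: spark
    values:
      - 3.5.6
      - 4.0.0
  - name: spark-no-4  # invented: version dimension without the experimental 4.0.0
    values:
      - 3.5.6
tests:
  - name: smoke
    dimensions:
      - spark
  - name: problem-case  # invented: stands for a test known to break on Spark 4
    dimensions:
      - spark-no-4
```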
Definition of Done Checklist
Author
Reviewer
Acceptance
- `type/deprecation` label & add to the deprecation schedule
- `type/experimental` label & add to the experimental features tracker